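Consider
$$y_t = \mu_t + \epsilon_t, \qquad \epsilon_t \overset{\text{iid}}{\sim} N(0, \sigma^2), \qquad t = 1, \dots, n.$$
Note that the RHS depends on $t$, so the distribution of $y_t$ changes with $t$, and we can't say the $y_t$ are "iid".
Parameters here are $\mu_1, \dots, \mu_n, \sigma$.
Without regularization, the MLE gets $\hat{\mu}_t = y_t$, which simply reproduces the data. So we define $\hat{\mu}^{\text{ridge}}(\lambda)$ and $\hat{\mu}^{\text{lasso}}(\lambda)$, which minimize
$$\sum_{t=1}^{n} (y_t - \mu_t)^2 + \lambda \sum_{t=2}^{n-1} \big( (\mu_{t+1} - \mu_t) - (\mu_t - \mu_{t-1}) \big)^2$$
and
$$\sum_{t=1}^{n} (y_t - \mu_t)^2 + \lambda \sum_{t=2}^{n-1} \big| (\mu_{t+1} - \mu_t) - (\mu_t - \mu_{t-1}) \big|.$$
We already know from the regression formulation $\mu_t = \beta_0 + \beta_1 (t - 1) + \sum_{j=2}^{n-1} \beta_j (t - j)_+$, with design matrix $X$, that
$$\hat{\mu}^{\text{ridge}}(\lambda) = X \hat{\beta}^{\text{ridge}}(\lambda), \qquad \hat{\mu}^{\text{lasso}}(\lambda) = X \hat{\beta}^{\text{lasso}}(\lambda),$$
where $\hat{\beta}^{\text{ridge}}(\lambda)$ and $\hat{\beta}^{\text{lasso}}(\lambda)$ respectively minimize
$$\|y - X\beta\|^2 + \lambda \sum_{j=2}^{n-1} \beta_j^2 \qquad \text{and} \qquad \|y - X\beta\|^2 + \lambda \sum_{j=2}^{n-1} |\beta_j|.$$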
This model is an example of the "mean model", where the focus is on the mean $\mu_t = \mathbb{E}[y_t]$. The next two models will be examples of "variance models".
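As a rough illustration of a ridge-type estimate in the mean model, the sketch below assumes the regularizer is the sum of squared second differences of the mean sequence (an assumption about the penalty's form). Under that assumption the minimizer solves a linear system. The function names `second_difference_matrix` and `ridge_trend` are hypothetical, chosen only for this example.

```python
import numpy as np

def second_difference_matrix(n):
    """(n-2) x n matrix D with (D @ mu)[i] = mu[i] - 2*mu[i+1] + mu[i+2]."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D

def ridge_trend(y, lam):
    """Minimize sum((y - mu)**2) + lam * sum((D @ mu)**2) over mu.

    Assumes a squared second-difference penalty (not confirmed by the notes).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    D = second_difference_matrix(n)
    # Setting the gradient to zero gives (I + lam * D'D) mu = y.
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
mu_unregularized = ridge_trend(y, 0.0)   # lam = 0 reproduces the data
mu_smooth = ridge_trend(y, 10.0)         # larger lam pulls the fit toward a line
```

With `lam = 0` the fit equals the data, matching the unregularized MLE. An exactly linear sequence has zero second differences, so the penalty leaves it unchanged for any `lam`.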
2 Model Two
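Consider
$$y_t \overset{\text{ind}}{\sim} N(0, \tau_t^2), \qquad t = 1, \dots, n. \tag{2.1}$$
The likelihood is
$$\prod_{t=1}^{n} \frac{1}{\sqrt{2\pi}\,\tau_t} \exp\left(-\frac{y_t^2}{2\tau_t^2}\right) \propto \prod_{t=1}^{n} \frac{1}{\tau_t} \exp\left(-\frac{y_t^2}{2\tau_t^2}\right). \tag{2.2}$$
The likelihood depends on the data only through the squares, so $y_1^2, \dots, y_n^2$ form the sufficient statistic. Under (2.1),
$$\frac{y_t^2}{\tau_t^2} \sim \chi^2_1.$$
Note that the density of $\chi^2_1$ is proportional to $x^{-1/2} e^{-x/2}$ for $x > 0$. The likelihood of $y_1^2, \dots, y_n^2$ is thus
$$\prod_{t=1}^{n} \frac{1}{\tau_t^2} \left(\frac{y_t^2}{\tau_t^2}\right)^{-1/2} \exp\left(-\frac{y_t^2}{2\tau_t^2}\right) = \prod_{t=1}^{n} \frac{1}{|y_t|\,\tau_t} \exp\left(-\frac{y_t^2}{2\tau_t^2}\right).$$
Dropping the factor $1/|y_t|$, which does not involve $\tau_t$, we have (2.2).
The log-likelihood is
$$-\sum_{t=1}^{n} \left( \alpha_t + \frac{y_t^2}{2} e^{-2\alpha_t} \right), \qquad \alpha_t = \log \tau_t.$$
Here we use $\alpha_t = \log \tau_t$ because it removes the constraint $\tau_t > 0$ and improves computational stability. Minimizing the negative log-likelihood without regularization gives $\hat{\alpha}_t = \log |y_t|$, that is, $\hat{\tau}_t = |y_t|$. So we introduce regularization. Assume $\alpha_t$ is smooth in $t$: define $\hat{\alpha}^{\text{ridge}}(\lambda)$ and $\hat{\alpha}^{\text{lasso}}(\lambda)$, which minimize
$$\sum_{t=1}^{n} \left( \alpha_t + \frac{y_t^2}{2} e^{-2\alpha_t} \right) + \lambda \sum_{t=2}^{n-1} \big( (\alpha_{t+1} - \alpha_t) - (\alpha_t - \alpha_{t-1}) \big)^2$$
and
$$\sum_{t=1}^{n} \left( \alpha_t + \frac{y_t^2}{2} e^{-2\alpha_t} \right) + \lambda \sum_{t=2}^{n-1} \big| (\alpha_{t+1} - \alpha_t) - (\alpha_t - \alpha_{t-1}) \big|.$$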
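The following sketch writes out a negative log-likelihood for the variance model, assuming a log-standard-deviation parametrization (each observation is normal with mean zero and standard deviation `exp(alpha[t])`) and a squared second-difference penalty. Both are assumptions made for illustration, and the function names are hypothetical. It also checks numerically that the unregularized minimizer is `log(abs(y))`.

```python
import numpy as np

def neg_log_lik(alpha, y):
    """Negative log-likelihood of y[t] ~ N(0, exp(2*alpha[t])), constants dropped."""
    return np.sum(alpha + 0.5 * y**2 * np.exp(-2.0 * alpha))

def neg_log_lik_grad(alpha, y):
    """Gradient of neg_log_lik with respect to alpha (one entry per time point)."""
    return 1.0 - y**2 * np.exp(-2.0 * alpha)

def ridge_objective(alpha, y, lam):
    """neg_log_lik plus lam times the sum of squared second differences of alpha."""
    return neg_log_lik(alpha, y) + lam * np.sum(np.diff(alpha, n=2) ** 2)

# Simulated data with a slowly varying standard deviation (illustrative only).
rng = np.random.default_rng(0)
tau = np.linspace(0.5, 2.0, 50)
y = tau * rng.standard_normal(50)

# Without regularization each alpha[t] is fit separately: alpha_hat[t] = log|y[t]|.
alpha_hat = np.log(np.abs(y))
grad_at_hat = neg_log_lik_grad(alpha_hat, y)  # numerically zero
```

The unregularized fit tracks every single observation, which is why a smoothness penalty is needed. The penalized objective could be handed to any general-purpose optimizer, since it is unconstrained in `alpha`.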
3 Model Three
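Apply model two to the DFT. Recall that for $y_0, \dots, y_{n-1}$, its DFT is $b_0, \dots, b_{n-1}$, where
$$b_j = \sum_{t=0}^{n-1} y_t \exp\left(-\frac{2\pi i j t}{n}\right), \qquad j = 0, \dots, n-1.$$
Assume $n$ is odd, $n = 2m + 1$. So we obtain model three from model two for the DFT terms $\mathrm{Re}(b_j), \mathrm{Im}(b_j)$, $j = 1, \dots, m$, and assume
$$\mathrm{Re}(b_j),\ \mathrm{Im}(b_j) \overset{\text{iid}}{\sim} N(0, \gamma_j^2).$$
Also assume $(\mathrm{Re}(b_j), \mathrm{Im}(b_j))$ are independent across $j$. The unknown parameters here are $\gamma_1, \dots, \gamma_m$. $\gamma_j$ represents the strength of sinusoids at frequency $j/n$. The likelihood is
$$\prod_{j=1}^{m} \frac{1}{2\pi\gamma_j^2} \exp\left(-\frac{\mathrm{Re}(b_j)^2 + \mathrm{Im}(b_j)^2}{2\gamma_j^2}\right) \propto \prod_{j=1}^{m} \frac{1}{\gamma_j^2} \exp\left(-\frac{|b_j|^2}{2\gamma_j^2}\right).$$
Recall: the periodogram is defined as
$$I(j/n) = \frac{|b_j|^2}{n}.$$
We can therefore rewrite the likelihood:
$$\prod_{j=1}^{m} \frac{1}{\gamma_j^2} \exp\left(-\frac{n\, I(j/n)}{2\gamma_j^2}\right).$$
So the periodogram forms the sufficient statistic in this model. Further,
$$\frac{n\, I(j/n)}{\gamma_j^2} = \frac{\mathrm{Re}(b_j)^2 + \mathrm{Im}(b_j)^2}{\gamma_j^2} \sim \chi^2_2.$$
So we rewrite the model as
$$I(j/n) \overset{\text{ind}}{\sim} \frac{\gamma_j^2}{n}\, \chi^2_2, \qquad j = 1, \dots, m.$$
The negative log-likelihood is
$$\sum_{j=1}^{m} \left( 2\alpha_j + \frac{n\, I(j/n)}{2} e^{-2\alpha_j} \right),$$
where $\alpha_j = \log \gamma_j$. Minimizing it directly, we have
$$\hat{\alpha}_j = \frac{1}{2} \log \frac{n\, I(j/n)}{2}, \qquad \text{that is,} \qquad \hat{\gamma}_j^2 = \frac{n\, I(j/n)}{2}.$$
Next regularize it: define $\hat{\alpha}^{\text{ridge}}(\lambda)$, $\hat{\alpha}^{\text{lasso}}(\lambda)$ as the minimizers of
$$\sum_{j=1}^{m} \left( 2\alpha_j + \frac{n\, I(j/n)}{2} e^{-2\alpha_j} \right) + \lambda \sum_{j=2}^{m-1} \big( (\alpha_{j+1} - \alpha_j) - (\alpha_j - \alpha_{j-1}) \big)^2$$
and
$$\sum_{j=1}^{m} \left( 2\alpha_j + \frac{n\, I(j/n)}{2} e^{-2\alpha_j} \right) + \lambda \sum_{j=2}^{m-1} \big| (\alpha_{j+1} - \alpha_j) - (\alpha_j - \alpha_{j-1}) \big|.$$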
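A small sketch of the periodogram computation, assuming the normalization in which the squared modulus of each DFT coefficient is divided by the series length, and assuming odd length so that only the first half of the nonzero frequencies is needed. `np.fft.fft` uses the same sign convention as the DFT recalled above. The function name `periodogram` and the test signal are made up for illustration.

```python
import numpy as np

def periodogram(y):
    """Return frequencies j/n and |b_j|^2 / n for j = 1, ..., m, where n = 2m + 1."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    if n % 2 == 0:
        raise ValueError("this sketch assumes n is odd")
    m = (n - 1) // 2
    b = np.fft.fft(y)  # b[j] = sum_t y[t] * exp(-2*pi*1j*j*t/n)
    j = np.arange(1, m + 1)
    return j / n, np.abs(b[j]) ** 2 / n

# A pure cosine at frequency 10/n puts all of its power at j = 10.
n = 101
t = np.arange(n)
y = np.cos(2 * np.pi * 10 * t / n)
freqs, I = periodogram(y)
peak_freq = freqs[np.argmax(I)]

# Unregularized variance estimate per frequency, assuming the real and imaginary
# parts of b_j each have variance gamma_j^2 (so |b_j|^2 has mean 2 * gamma_j^2).
gamma2_hat = n * I / 2.0
```

For real data the coefficients at `j` and `n - j` are complex conjugates, so the frequencies `1/n, ..., m/n` carry all the information beyond the mean term at `j = 0`.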